Methodological Considerations in Estimation of Phenotype Heritability Using Genome-Wide SNP Data, Illustrated by an Analysis of the Heritability of Height in a Large Sample of African Ancestry Adults
Abstract
Height has an extremely polygenic pattern of inheritance. Genome-wide association studies (GWAS) have revealed hundreds of common variants that are associated with human height at genome-wide levels of significance. However, only a small fraction of phenotypic variation can be explained by the aggregate of these common variants. In a large study of African-American men and women (n = 14,419), we genotyped and analyzed 966,578 autosomal SNPs across the entire genome using a linear mixed model variance components approach implemented in the program GCTA (Yang et al., Nat Genet 2010), and estimated an additive heritability of 44.7% (SE: 3.7%) for this phenotype in a sample of evidently unrelated individuals. While this estimated value is similar to that given by Yang et al. in their analyses, we remain concerned about two related issues: (1) whether, in the complete absence of hidden relatedness, variance components methods have adequate power to estimate heritability when a very large number of SNPs are used in the analysis; and (2) whether estimation of heritability may be biased, in real studies, by low levels of residual hidden relatedness. We addressed the first question in a semi-analytic fashion by directly simulating the distribution of the score statistic for a test of zero heritability with and without low levels of relatedness. The second question was addressed by a careful comparison of the behavior of estimated heritability for both observed (self-reported) height and simulated phenotypes, compared to imputation R², as a function of the number of SNPs used in the analysis.
These simulations help to address the important question of whether today's GWAS SNPs will remain useful for imputing causal variants that are discovered using very large sample sizes in future studies of height, or whether the causal variants themselves will need to be genotyped de novo in order to build a prediction model that ultimately captures a large fraction of the variability of height and, by implication, of other complex phenotypes. Our overall conclusions are that when study sizes are quite large (5,000 or so) the additive heritability estimate for height is not apparently biased upwards using the linear mixed model; however, there is evidence in our simulations that a very large number of causal variants (many thousands), each with a very small effect on phenotypic variance, will need to be discovered to fill the gap between the heritability explained by known versus unknown causal variants. We conclude that today's GWAS data will remain useful in the future for causal variant prediction, but that finding the causal variants that need to be predicted may be extremely laborious.
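To make the variance components idea concrete, the following is a minimal simulation sketch in Python. It is not the GCTA REML procedure used in the study: it swaps in the simpler Haseman-Elston moment estimator, and it assumes independent SNPs (no linkage disequilibrium), fully unrelated individuals, and a phenotype standardized to unit variance. All function names, sample sizes, and parameter values are hypothetical choices for illustration. The score function is the derivative of the Gaussian log-likelihood with respect to h² at h² = 0 under the covariance model V = (1 - h²)I + h²A, where A is the genetic relationship matrix.

```python
import numpy as np

def simulate_genotypes(n, m, rng):
    """Draw 0/1/2 genotypes for n unrelated people at m independent SNPs.
    (Assumption: no LD and no relatedness; real data have both.)"""
    freqs = rng.uniform(0.05, 0.5, size=m)
    return rng.binomial(2, freqs, size=(n, m)).astype(float)

def standardize(genotypes):
    """Center and scale each SNP column; drop monomorphic SNPs."""
    z = genotypes - genotypes.mean(axis=0)
    sd = z.std(axis=0)
    keep = sd > 0
    return z[:, keep] / sd[keep]

def grm(z):
    """Genetic relationship matrix A = Z Z^T / m from standardized genotypes."""
    return z @ z.T / z.shape[1]

def simulate_phenotype(z, h2, rng):
    """Phenotype with variance ~1: every SNP causal with a tiny random effect."""
    n, m = z.shape
    g = (z @ rng.normal(0.0, 1.0, size=m)) * np.sqrt(h2 / m)
    e = rng.normal(0.0, 1.0, size=n) * np.sqrt(1.0 - h2)
    return g + e

def he_regression(y, a):
    """Haseman-Elston estimate of h2: regress y_i * y_j on A_ij over pairs i < j
    (a moment estimator, used here as a stand-in for REML)."""
    y = (y - y.mean()) / y.std()
    iu = np.triu_indices_from(a, k=1)
    x = a[iu]
    prod = np.outer(y, y)[iu]
    return float(x @ prod / (x @ x))

def score_at_zero(y, a):
    """Score for h2 at h2 = 0 under V = (1 - h2) I + h2 A, taking the
    phenotypic variance as 1 after standardizing y (an approximation)."""
    y = (y - y.mean()) / y.std()
    n = len(y)
    return float(0.5 * (y @ (a @ y) - y @ y) - 0.5 * (np.trace(a) - n))

# Small demonstration with hypothetical sizes and a true h2 of 0.5.
demo_rng = np.random.default_rng(0)
demo_z = standardize(simulate_genotypes(600, 300, demo_rng))
demo_a = grm(demo_z)
demo_y = simulate_phenotype(demo_z, 0.5, demo_rng)
print("HE estimate of h2:", round(he_regression(demo_y, demo_a), 3))
print("Score at h2 = 0:", round(score_at_zero(demo_y, demo_a), 1))
```

Repeating the last four lines with `h2 = 0.0` and many random seeds gives an empirical null distribution of the score, which is the kind of quantity the abstract describes simulating. In this toy setting the sampling error of the estimate grows with the number of SNPs relative to the sample size, which is the power concern raised in issue (1).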
Similar resources
Massively expedited genome-wide heritability analysis (MEGHA).
The discovery and prioritization of heritable phenotypes is a computational challenge in a variety of settings, including neuroimaging genetics and analyses of the vast phenotypic repositories in electronic health record systems and population-based biobanks. Classical estimates of heritability require twin or pedigree data, which can be costly and difficult to acquire. Genome-wide complex trai...
Limitations of GCTA as a solution to the missing heritability problem.
Genome-wide association studies (GWASs) seek to understand the relationship between complex phenotype(s) (e.g., height) and up to millions of single-nucleotide polymorphisms (SNPs). Early analyses of GWASs are commonly believed to have "missed" much of the additive genetic variance estimated from correlations between relatives. A more recent method, genome-wide complex trait analysis (GCTA), ob...
Estimation of Genotypic Correlation and Heritability of some of Traits in Faba Bean Genotypes Using Restricted Maximum Likelihood (REML)
In order to estimate the genotypic correlation and heritability of some faba bean traits, 26 faba bean genotypes were evaluated in a randomized complete block design with three replications during the 2014-16 growing seasons at the Agricultural Research Station of Borujerd, located in Lorestan province, Iran. The restricted maximum likelihood (REML) was used to estimate the genotypic and phenotypic correla...
Heritability parameters for some body measurements in Turkish Arabian foals
The aim of the present study was to estimate genetic parameters of body measurements in Turkish Arabian foals. Records of wither height, front cannon circumference, heart girth, and body length were used for body measurements at birth, 6 months of age, 1 year of age, and 2 years of age. Heritability of body measurements of Turkish Arabian foals was estimated by using records of foals sired by stallions wh...
The Impact of Different Genetic Architectures on Accuracy of Genomic Selection Using Three Bayesian Methods
Genome-wide evaluation uses the associations of a large number of single nucleotide polymorphism (SNP) markers across the whole genome and then combines statistical methods with genomic data to predict genetic values. Genomic prediction relies on linkage disequilibrium (LD) between genetic markers and quantitative trait loci (QTL) in a population. Methods that use all markers simultaneo...